Hybrid Text Summarization Method based on the TF Method and the Lead Method

Authors

  • Kai Ishikawa
  • Shinichi Ando
  • Akitoshi Okumura
Abstract

This paper describes a hybrid text summarization method that combines a TF-based sentence extraction method with a LEAD sentence extraction method. The LEAD method is known to be more effective than other methods for summarizing newspaper articles at lower summarization (output-to-input) ratios. To combine the LEAD method with the TF method, we used a rectangular distribution function that determines the importance of sentences according to their position in a document. With our method, the importance of a sentence is determined by multiplying the TF-based score by the distribution function. We conducted an open test evaluation using the formal run test data of the sentence extraction sub-task in the NTCIR-2 Workshop TSC task (30 newspaper articles). The methods were compared by the average F-measure over 10%, 30%, and 50% summaries, yielding 34.1% for the TF method, 39.1% for the LEAD method, and 42.4% for the proposed method.
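The scoring rule in the abstract (TF-based score multiplied by a rectangular position function) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives neither the exact TF formula nor the shape parameters of the rectangular function, so the length-normalized term-frequency score, the helper names, and the parameters `lead_fraction`, `inside`, and `outside` are all assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter (assumption: sentences end with . ! or ?)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tf_scores(sentences):
    """Assumed TF score: mean document-level term frequency of a sentence's words."""
    tokens = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    tf = Counter(t for toks in tokens for t in toks)
    return [sum(tf[t] for t in toks) / len(toks) if toks else 0.0 for toks in tokens]


def rectangular_weight(index, n_sentences, lead_fraction=0.5, inside=1.0, outside=0.5):
    """Hypothetical rectangular position function: one constant weight for the
    leading part of the document, another constant weight for the rest."""
    cutoff = lead_fraction * n_sentences
    return inside if index < cutoff else outside


def hybrid_summary(text, ratio=0.3, **weight_params):
    """Rank sentences by TF score times position weight; return the top ones
    in their original document order."""
    sentences = split_sentences(text)
    n = len(sentences)
    scores = [
        tf * rectangular_weight(i, n, **weight_params)
        for i, tf in enumerate(tf_scores(sentences))
    ]
    k = max(1, round(ratio * n))  # number of sentences to keep
    top = sorted(range(n), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]
```

In this sketch, setting `outside=0.0` restricts selection to the leading part of the document, with the TF score deciding which of those sentences are kept, while `inside == outside` reduces it to plain TF ranking. Values in between trade off the two criteria.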


Similar resources

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the most representative sentences from the input document. When we are facing large-volume documents, the extr...


Graph Hybrid Summarization

One solution for processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...


Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, and online sources, and users' desire to quickly access information, automatic text summarization has caught the attention of many researchers in this field. Researchers have presented different methods for summarizing texts and for producing a useful summary that includes the relevant sentences of the document. This study select...


Multiple Post Microblog Summarization

The use of microblogs such as Twitter has increased incredibly over the past few years. Because of the public nature and sheer volume of text from these constantly changing microblogs, it is often difficult to fully understand what is being said about various topics. A method for summarizing popular topics of microblogs has been proposed but its summaries are only one sentence or phrase in leng...


An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches

Text coherence evaluation has become a vital and appealing task in Natural Language Processing subfields such as text summarization, question answering, text generation, and machine translation. Existing methods such as entity-based and graph-based models deal with how nouns and noun phrases change roles across consecutive sentences within a short part of a text. They even have limitations in global coheren...




Publication date: 2001